probability model
- North America > United States > California > Alameda County > Berkeley (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Asia > Middle East > Jordan (0.04)
Uncertainty-Aware Active Source Tracking of Marine Pollution using Unmanned Surface Vehicles
Ma, Song, Wang, Yanchao, Bucknall, Richard, Liu, Yuanchang
Abstract-- This paper proposes an uncertainty-aware marine pollution source tracking framework for unmanned surface vehicles (USVs). By integrating high-fidelity marine pollution dispersion simulation with informative path planning techniques, we demonstrate effective identification of pollution sources in marine environments. The proposed approach is implemented based on Robot Operating System (ROS), processing real-time sensor data to update probabilistic source location estimates. Experiments conducted in simulated environments with varying source locations, wave conditions, and starting positions demonstrate the framework's ability to localise pollution sources with high accuracy. Results show that the proposed approach achieves reliable source localisation efficiently and outperforms the existing baseline. This work contributes to the development of fully autonomous environmental monitoring capabilities essential for rapid response to marine pollution incidents.
Pollution discharged into the marine environment causes severe consequences for ecosystems [1], [2] and human health [3].
- Europe > United Kingdom (0.14)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Asia > China (0.04)
- Asia > Bangladesh (0.04)
- North America > United States > New York (0.04)
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
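The abstract above describes updating probabilistic source-location estimates from real-time sensor data. The paper's ROS implementation is not reproduced here; as a rough sketch under invented assumptions (a gridded posterior, a Gaussian measurement likelihood, and a toy distance-decay dispersion surrogate), one Bayesian update step might look like:

```python
import numpy as np

def update_source_posterior(prior, sensor_pos, reading, expected_conc, noise_std=0.1):
    """One Bayesian update of a gridded posterior over source locations.

    prior         : (H, W) array, current probability of the source per cell
    sensor_pos    : (row, col) where the USV took the measurement
    reading       : measured pollutant concentration
    expected_conc : function (source_cell, sensor_pos) -> predicted concentration
    """
    H, W = prior.shape
    likelihood = np.empty_like(prior)
    for i in range(H):
        for j in range(W):
            mu = expected_conc((i, j), sensor_pos)
            # Gaussian measurement-noise likelihood
            likelihood[i, j] = np.exp(-0.5 * ((reading - mu) / noise_std) ** 2)
    posterior = prior * likelihood
    return posterior / posterior.sum()

# Toy dispersion surrogate: concentration decays with distance to the source.
def toy_conc(source, sensor):
    d = np.hypot(source[0] - sensor[0], source[1] - sensor[1])
    return 1.0 / (1.0 + d)

prior = np.full((10, 10), 1.0 / 100)  # uniform prior over a 10x10 grid
post = update_source_posterior(prior, sensor_pos=(0, 0), reading=1.0,
                               expected_conc=toy_conc)
# Mass concentrates on cells whose predicted concentration matches the reading.
```

In an informative-path-planning loop, the USV would then choose its next waypoint to maximally reduce the uncertainty of this posterior before repeating the update.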
Model averaging in the space of probability distributions
Androulakis, Emmanouil, Papayiannis, Georgios I., Yannacopoulos, Athanasios N.
In the modern era, the complexity and density of data structures have significantly increased, particularly with the advent of technologies such as cloud computing, sensor networks and manifold-based data representations. A notable case within this landscape is the class of measure-valued data, which encompasses data best represented through probability distributions rather than individual observations (Ranjan and Gneiting, 2010; Gneiting and Ranjan, 2013). This framework is prevalent across various fields, including actuarial science, economics and finance, and environmental sciences, where uncertainty and heterogeneity are inherent and models must reflect the full distributional information. For instance, in economics, integrating diverse models allows for the generation of numerous meaningful probabilistic scenarios that can effectively inform future decision-making (Moral-Benito, 2015; Hong and Martin, 2017; Christensen et al., 2018; Steel, 2020; Koundouri et al., 2024). In environmental sciences, the prediction of future states through stochastic simulation models is crucial for evaluating the consequences of natural hazards (Muis et al., 2015; Hsiang et al., 2017; Fronzek et al., 2022) or improving climatic forecasts (Friederichs and Thorarinsdottir, 2012; Scheuerer and Möller, 2015; Papayiannis et al., 2018).
- North America > United States (0.14)
- Europe > Norway (0.04)
- Africa > Middle East > Egypt (0.04)
- Africa > Ethiopia (0.04)
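Averaging in the space of probability distributions, as discussed in the abstract above, can be made concrete in one dimension: the 2-Wasserstein barycenter of 1D distributions is obtained by averaging quantile functions. A minimal sketch (not the authors' estimator; the weights and sample sizes below are invented):

```python
import numpy as np

def wasserstein_barycenter_1d(samples_list, weights):
    """Weighted 2-Wasserstein barycenter of 1D empirical distributions.

    In one dimension the barycenter's quantile function is simply the
    weighted average of the input quantile functions."""
    grid = np.linspace(0.01, 0.99, 99)                      # quantile levels
    quantiles = np.array([np.quantile(s, grid) for s in samples_list])
    return np.average(quantiles, axis=0, weights=weights)   # barycenter quantiles

rng = np.random.default_rng(0)
models = [rng.normal(0.0, 1.0, 5000), rng.normal(4.0, 1.0, 5000)]
bary_q = wasserstein_barycenter_1d(models, weights=[0.5, 0.5])
# The equal-weight barycenter of N(0,1) and N(4,1) is N(2,1): its median
# (the middle entry of bary_q) sits near 2, halfway between the two modes.
```

Unlike a simple mixture, which would be bimodal, the barycenter preserves the common Gaussian shape while interpolating location, which is why distribution-space averaging is attractive for combining probabilistic models.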
Data-Dependent Hidden Markov Model with Off-Road State Determination and Real-Time Viterbi Algorithm for Lane Determination in Autonomous Vehicles
Stas, Mike, Hu, Wang, Farrell, Jay A.
Lane determination and lane sequence determination are important components for many Connected and Automated Vehicle (CAV) applications. Lane determination has been solved using Hidden Markov Models (HMMs), among other methods. The existing HMM literature for lane sequence determination uses empirical definitions with user-modified parameters to calculate HMM probabilities. The probability definitions in the literature can cause breaks in the HMM due to the inability to directly calculate probabilities of off-road positions, requiring post-processing of data. This paper develops a time-varying HMM using the physical properties of the roadway and vehicle, and the stochastic properties of the sensors. This approach yields emission and transition probability models conditioned on the sensor data without parameter tuning. It also accounts for the probability that the vehicle is not in any roadway lane (e.g., on the shoulder or making a U-turn), which eliminates the need for post-processing to deal with breaks in the HMM processing. This approach requires adapting the Viterbi algorithm and the HMM to be conditioned on the sensor data, which are then used to generate the most-likely sequence of lanes the vehicle has traveled. The proposed approach achieves an average accuracy of 95.9%. Compared to the existing literature, this provides an average increase of 2.25% by implementing the proposed transition probability and an average increase of 5.1% by implementing both the proposed transition and emission probabilities.
- North America > United States > California > Riverside County > Riverside (0.28)
- North America > United States > Iowa (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Transportation (1.00)
- Government > Regional Government > North America Government > United States Government (0.67)
- Automobiles & Trucks (0.67)
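The paper adapts the Viterbi algorithm to probabilities conditioned on sensor data; the underlying recursion is the standard one. A log-domain sketch with an off-road state appended to the lane states, echoing the paper's extended state set (all matrices below are invented for illustration):

```python
import numpy as np

def viterbi(obs, init, trans, emit):
    """Most-likely state sequence of an HMM, computed in the log domain.

    obs   : observation indices, length T
    init  : (S,) initial state probabilities
    trans : (S, S) transition matrix, trans[i, j] = P(state j | state i)
    emit  : (S, O) emission matrix,   emit[s, o]  = P(obs o | state s)
    """
    T = len(obs)
    logd = np.log(init) + np.log(emit[:, obs[0]])
    back = np.zeros((T, len(init)), dtype=int)
    for t in range(1, T):
        scores = logd[:, None] + np.log(trans)   # scores[from_state, to_state]
        back[t] = scores.argmax(axis=0)          # best predecessor per state
        logd = scores.max(axis=0) + np.log(emit[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):                # backtrack
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Three lane states (0-2) plus an off-road state (3).
init  = np.array([0.3, 0.3, 0.3, 0.1])
trans = np.array([[0.80, 0.10, 0.05, 0.05],
                  [0.10, 0.70, 0.10, 0.10],
                  [0.05, 0.10, 0.75, 0.10],
                  [0.20, 0.20, 0.20, 0.40]])
emit  = np.eye(4) * 0.6 + 0.1                    # noisy lane measurements
lanes = viterbi([0, 0, 1, 1, 1], init, trans, emit)
```

The paper's contribution replaces the fixed `trans` and `emit` above with time-varying, sensor-conditioned models; the backtracking structure is unchanged.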
Preference Optimization via Contrastive Divergence: Your Reward Model is Secretly an NLL Estimator
Chen, Zhuotong, Liu, Fang, Zhu, Xuan, Qi, Yanjun, Ghavamzadeh, Mohammad
Existing studies on preference optimization (PO) have centered on constructing pairwise preference data following simple heuristics, such as maximizing the margin between preferred and dispreferred completions based on human (or AI) ranked scores. However, none of these heuristics has a full theoretical justification. In this work, we develop a novel PO framework that provides theoretical guidance to effectively sample dispreferred completions. To achieve this, we formulate PO as minimizing the negative log-likelihood (NLL) of a probability model and propose to estimate its normalization constant via a sampling strategy. As we will demonstrate, these estimative samples can act as dispreferred completions in PO. We then select contrastive divergence (CD) as the sampling strategy, and propose a novel MC-PO algorithm that applies the Monte Carlo (MC) kernel from CD to sample hard negatives w.r.t. the parameterized reward model. Finally, we propose the OnMC-PO algorithm, an extension of MC-PO to the online setting. On popular alignment benchmarks, MC-PO outperforms existing SOTA baselines, and OnMC-PO leads to further improvement.
- North America > United States > California > Santa Clara County > Sunnyvale (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
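The core idea in the abstract above, treating PO as NLL minimization of a probability model whose normalization constant is estimated from sampled dispreferred completions, can be sketched generically. This is not MC-PO itself; the reward function and negatives below are toy stand-ins for the CD-sampled hard negatives:

```python
import math

def nll_with_sampled_normalizer(reward, preferred, negatives):
    """Negative log-likelihood of a Boltzmann-style model p(y) ~ exp(reward(y)),
    with the intractable normalizer replaced by a log-sum-exp over the
    preferred completion plus a small set of sampled negatives."""
    logits = [reward(preferred)] + [reward(y) for y in negatives]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))  # stable log-sum-exp
    return log_z - reward(preferred)  # -log p_hat(preferred)

# Toy reward: longer "completions" score higher (purely illustrative).
reward = lambda y: 0.5 * len(y)
negatives = ["fine", "good answer"]  # stand-ins for MC/CD-sampled hard negatives
loss = nll_with_sampled_normalizer(reward, "excellent detailed answer", negatives)
# The loss shrinks as the preferred completion outscores the sampled negatives.
```

The quality of this objective hinges on how the negatives are drawn, which is exactly where the paper's contrastive-divergence MC kernel enters: harder negatives give a tighter estimate of the normalizer.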
Reviews: Spectral Learning of Dynamic Systems from Nonequilibrium Data
Update after author feedback: I thank the authors for their constructive response and hope they can incorporate as many of the promised changes as possible. Based on the feedback and reviewer discussion, I now have a better idea of the motivation behind the work. Perhaps a brief description of a concrete example application problem motivating the work would make it easier for others more accustomed to machine-learning approaches to learning dynamical models to appreciate the work too. The experimental evaluation would benefit a lot from explicit comparisons with Bayesian alternatives (e.g. Ruttor et al., NIPS 2013; Svensson et al., AISTATS 2016 and references therein) to properly understand the pros and cons of the different approaches.
Reward Modeling with Ordinal Feedback: Wisdom of the Crowd
Liu, Shang, Pan, Yu, Chen, Guanting, Li, Xiaocheng
Learning a reward model (RM) from human preferences has been an important component in aligning large language models (LLMs). The canonical setup of learning RMs from pairwise preference data is rooted in the classic Bradley-Terry (BT) model that accepts binary feedback, i.e., the label being either Response 1 is better than Response 2, or the opposite. Such a setup inevitably discards potentially useful samples (such as "tied" between the two responses) and loses more fine-grained information (such as "slightly better"). In this paper, we propose a framework for learning RMs under ordinal feedback which generalizes the case of binary preference feedback to any arbitrary granularity. Specifically, we first identify a marginal unbiasedness condition, which generalizes the assumption of the BT model in the existing binary feedback setting. The condition validates itself via the sociological concept of the wisdom of the crowd. Under the condition, we develop a natural probability model for pairwise preference data under ordinal feedback and analyze its properties. We prove the statistical benefits of ordinal feedback in terms of reducing the Rademacher complexity compared to the case of binary feedback. The proposed learning objective and the theory also extend to hinge loss and direct policy optimization (DPO). In particular, the theoretical analysis may be of independent interest when applied to a seemingly unrelated problem of knowledge distillation to interpret the bias-variance trade-off therein. The framework also sheds light on writing guidance for human annotators. Our numerical experiments validate that fine-grained feedback leads to better reward learning for both in-distribution and out-of-distribution settings. Further experiments show that incorporating a certain proportion of samples with tied preference boosts RM learning.
- Europe > Greece > Attica > Athens (0.04)
- North America > United States > North Carolina (0.04)
- Europe > United Kingdom > England (0.04)
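One way to picture ordinal feedback generalizing the Bradley-Terry setup: replace the binary label with a label z in [0, 1] and keep the same cross-entropy against sigmoid(r1 - r2). The sketch below is a plain soft-label BT cross-entropy, not necessarily the paper's exact objective, and the label values are illustrative:

```python
import math

def ordinal_bt_loss(r1, r2, z):
    """Cross-entropy between an ordinal label z in [0, 1] (interpreted as
    P(response 1 preferred)) and the Bradley-Terry probability
    sigmoid(r1 - r2). z = 1 recovers the binary 'response 1 wins' case;
    z = 0.5 encodes a tie; a value like 0.75 can encode 'slightly better'."""
    p = 1.0 / (1.0 + math.exp(-(r1 - r2)))
    eps = 1e-12  # guard the logarithms
    return -(z * math.log(p + eps) + (1 - z) * math.log(1 - p + eps))

# With a tied label, equal rewards minimize the loss; a reward gap is penalized.
print(round(ordinal_bt_loss(0.0, 0.0, 0.5), 4))  # → 0.6931 (= ln 2, the minimum)
print(round(ordinal_bt_loss(1.0, 0.0, 0.5), 4))  # → 0.8133
```

Under this view, "tied" samples are not discarded: they actively pull the two rewards together, which is consistent with the paper's finding that a proportion of tied samples boosts RM learning.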
A Temporal Stochastic Bias Correction using a Machine Learning Attention model
Nivron, Omer, Wischik, Damon J., Vrac, Mathieu, Shuckburgh, Emily, Archibald, Alex T.
Climate models are biased with respect to real-world observations. They usually need to be adjusted before being used in impact studies. The suite of statistical methods that enable such adjustments is called bias correction (BC). However, BC methods currently struggle to adjust temporal biases because they mostly disregard the dependence between consecutive time points. As a result, climate statistics with long-range temporal properties, such as heatwave duration and frequency, cannot be corrected accurately. This makes it more difficult to produce reliable impact studies on such climate statistics. This paper offers a novel BC methodology to correct temporal biases. This is made possible by rethinking the philosophy behind BC. We will introduce BC as a time-indexed regression task with stochastic outputs. Rethinking BC enables us to adapt state-of-the-art machine learning (ML) attention models and thereby learn different types of biases, including temporal asynchronicities. With a case study of heatwave duration statistics in Abuja, Nigeria, and Tokyo, Japan, we show more accurate results than current climate model outputs and alternative BC methods.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.57)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
- Africa > Nigeria > Federal Capital Territory > Abuja (0.26)
- (5 more...)
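For contrast with the temporal approach in the abstract above, classical bias correction is often a pointwise quantile mapping. A sketch with synthetic data (a generic textbook method, not necessarily one of the paper's baselines):

```python
import numpy as np

def quantile_map(model_hist, obs_hist, model_future):
    """Classical pointwise quantile-mapping bias correction: send each
    model value x through obsCDF^{-1}(modelCDF(x)). This corrects the
    marginal distribution but, acting on each time step independently,
    ignores dependence between consecutive points, which is exactly the
    temporal limitation the paper targets."""
    ranks = np.searchsorted(np.sort(model_hist), model_future) / len(model_hist)
    return np.quantile(obs_hist, np.clip(ranks, 0.0, 1.0))

rng = np.random.default_rng(1)
model_hist = rng.normal(2.0, 1.0, 10000)   # biased "climate model" output
obs_hist   = rng.normal(0.0, 1.0, 10000)   # "observations"
corrected  = quantile_map(model_hist, obs_hist, model_hist[:1000])
# The roughly +2.0 mean bias is removed; the corrected series has mean near 0.
```

Because each value is mapped independently, a corrected series can still get heatwave durations wrong even when every marginal quantile is right, which motivates recasting BC as a time-indexed stochastic regression.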